164 research outputs found
Linearity Testing Against a Fuzzy Rule-based Model
In this paper, we introduce a linearity test for fuzzy rule-based models in the framework of time series modeling. To do so, we explore a family of statistical models, the regime switching autoregressive models, and the relations that link them to the fuzzy rule-based models. From these relations, we derive a Lagrange Multiplier linearity test and some properties of the maximum likelihood estimator needed for it. Finally, an empirical study of the goodness of the test is presented.
Keywords: fuzzy rule-based models, time series, linearity test, statistical inference
Testing for Heteroskedasticity of the Residuals in Fuzzy Rule-Based Models
In this paper, we propose a new diagnostic checking tool
for fuzzy rule-based modelling of time series. Through the study of the
residuals in the Lagrange Multiplier testing framework we devise a hypothesis
test which allows us to determine if the residual time series is
homoscedastic or not, that is, if it has the same variance throughout time.
This is another important step towards a statistically sound modelling
strategy for fuzzy rule-based models.
Supported by the Spanish Ministerio de Ciencia e Innovación (MICINN) under Project grants MICINN TIN2009-14575 and CIT-460000-2009-4.
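To make the idea of a Lagrange Multiplier check for homoscedastic residuals concrete, here is a minimal sketch in Python. It is not the test derived in the paper: it uses the generic Engle-style ARCH(1) statistic, n times R squared from regressing each squared residual on the previous one, which is compared against a chi-square distribution with one degree of freedom. The function name `arch_lm_statistic` and both example residual series are assumptions made for illustration only.

```python
def arch_lm_statistic(residuals):
    """Engle-style ARCH(1) LM statistic: n * R^2 from regressing e_t^2 on e_{t-1}^2.

    Generic illustration only; not the fuzzy rule-based test from the paper.
    """
    sq = [e * e for e in residuals]
    y, x = sq[1:], sq[:-1]          # current and lagged squared residuals
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0                  # constant squared residuals: no evidence at all
    r_squared = sxy * sxy / (sxx * syy)
    return n * r_squared


CHI2_1_CRIT_5PCT = 3.841  # 95% quantile of the chi-square distribution, 1 degree of freedom

# Hypothetical residual series: one with constant variance, one whose
# variance jumps halfway through the sample.
constant = [(-1.0) ** i for i in range(100)]
shifting = [0.1 * (-1) ** i for i in range(50)] + [2.0 * (-1) ** i for i in range(50)]

for name, series in (("constant", constant), ("shifting", shifting)):
    stat = arch_lm_statistic(series)
    verdict = "reject" if stat > CHI2_1_CRIT_5PCT else "do not reject"
    print(f"{name}: LM = {stat:.2f}, {verdict} homoscedasticity at the 5% level")
```

A large statistic means that the size of one residual helps predict the size of the next, which is the signature of a variance that changes over time.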
Semantics of Data Mining Services in Cloud Computing
M. Parra-Royon holds an "Excelencia" scholarship from the Regional Government of Andalucía (Spain).
This work was supported by the Research Projects P12-TIC-2958 and TIN2016-81113-R (Ministry of
Economy, Industry and Competitiveness - Government of Spain).
In recent years, with the rise of Cloud Computing (CC), many companies providing services in the cloud are adding a new series of services to their catalogues, such as data mining (DM) and data processing (DP), taking advantage of the vast computing resources available to them. Different service definition proposals have been put forward to address the problem of describing services in CC in a comprehensive way. Since each provider has its own definition of the logic of its services, and specifically of DM services, the ability to describe services flexibly across providers is fundamental to maintaining the usability and portability of this type of CC service. The use of semantic technologies based on the Linked Data (LD) proposal for the definition of services allows the design and modelling of DM services, achieving a high degree of interoperability. In this article, a schema for the definition of DM services on CC is presented, considering all key aspects of a service in CC, such as prices, interfaces, Service Level Agreement (SLA), instances or DM workflow, among others. The new schema is based on LD, and it reuses other schemata, obtaining a better and more complete definition of the services. In order to validate the completeness of the schema, a series of DM services have been created in which a set of algorithms such as Random Forest (RF) or KMeans are modelled as services. In addition, a dataset has been generated including the definition of the services of several actual CC DM providers, confirming the effectiveness of the schema.
Fuzzy Systems-as-a-Service in Cloud Computing
Fuzzy systems have become widely accepted and applied in a host of domains such as control, electronics or mechanics. The
software for constructing these systems has traditionally relied on tools, platforms and languages run on on-premise
computing infrastructure. On the other hand, the rise and ubiquity of the cloud computing model has brought a revolutionary way
of deploying computing services. The boost of cloud services is leading towards increasingly specific service offerings, such
as data mining and machine learning services. Unfortunately, so far, no definition of a fuzzy system as a service is available. This
paper identifies this opportunity and focuses on developing a proposal for a fuzzy system-as-a-service definition. To achieve this, the
proposal pursues three objectives: the complete description of cloud services for fuzzy systems using semantic technology, the
composition of services, and the exploitation of the model in cloud platforms for integration with other services. As an illustrative
case, a real-world problem is addressed with the proposed specification.
This work was supported by the Research Projects P12-TIC-2958 and TIN2016-81113-R (Ministry of Economy,
Industry and Competitiveness - Government of Spain).
Neural Networks in R Using the Stuttgart Neural Network Simulator: RSNNS
Neural networks are important standard machine learning procedures for classification and regression. We describe the R package RSNNS that provides a convenient interface to the popular Stuttgart Neural Network Simulator SNNS. The main features are (a) encapsulation of the relevant SNNS parts in a C++ class, for sequential and parallel usage of different networks, (b) accessibility of all of the SNNS algorithmic functionality from R using a low-level interface, and (c) a high-level interface for convenient, R-style usage of many standard neural network procedures. The package also includes functions for visualization and analysis of the models and the training procedures, as well as functions for data input/output from/to the original SNNS file formats.
This work was supported in part by the Spanish Ministry of Science and Innovation (MICINN) under Project TIN-2009-14575. C. Bergmeir holds a scholarship from the Spanish Ministry of Education (MEC) of the "Programa de Formación del Profesorado Universitario (FPU)".
Overall quality optimization for DQM stage in High Energy Physics experiments
Data Acquisition (DAQ) and Data Quality Monitoring (DQM) are key parts in
the HEP data chain, where the data are processed and analyzed to obtain accurate monitoring
quality indicators. Such stages are complex, including an intense processing work-flow and
requiring a high degree of interoperability between software and hardware facilities. Data
recorded by DAQ sensors and devices are sampled to perform live (and offline) DQM of the
status of the detector during data collection providing to the system and scientists the ability
to identify problems with extremely low latency, minimizing the amount of data that would
otherwise be unsuitable for physical analysis. The DQM stage performs a large set of operations
(Fast Fourier Transform (FFT), clustering, classification algorithms, Region of Interest, particle
tracking, etc.) involving the use of computing resources and time, depending on the number of
events of the experiment, sampling data, complexity of the tasks or the quality performance. The
objective of our work is to present a proposal aimed at developing a general optimization of the
DQM stage considering all these elements. Techniques based on computational intelligence like
EA can help improve the performance and therefore achieve an optimization of task scheduling
in DQM.
Funding: (MINECO - Gov. of Spain) P12-TIC-2958, TIN2016-81113-
SCMFTS: Scalable and Distributed Complexity Measures and Features for Univariate and Multivariate Time Series in Big Data Environments
This research has been partially funded by the following grants: TIN2016-81113-R from the Spanish Ministry of Economy and Competitiveness, P12-TIC-2985 and P18-TP-5168 from Andalusian Regional Government, Spain, and EU Commission with FEDER funds. Francisco J. Baldan holds the FPI grant BES-2017-080137 from the Spanish Ministry of Economy and Competitiveness. D. Peralta is a Postdoctoral Fellow of the Research Foundation of Flanders (170303/12X1619N). Y. Saeys is an ISAC Marylou Ingram Scholar.
Time series data are becoming increasingly important due to the interconnectedness of the world. Classical problems, which
are getting bigger and bigger, require more and more resources for their processing, and Big Data technologies offer many
solutions. Although the principal algorithms for traditional vector-based problems are available in Big Data environments,
the lack of tools for time series processing in these environments needs to be addressed. In this work, we propose a scalable
and distributed time series transformation for Big Data environments based on well-known time series features (SCMFTS),
which allows practitioners to apply traditional vector-based algorithms to time series problems. The proposed transformation,
along with the algorithms available in Spark, improved the best results in the state-of-the-art on the Wearable Stress
and Affect Detection dataset, which is the biggest publicly available multivariate time series dataset in the University of
California Irvine (UCI) Machine Learning Repository. In addition, SCMFTS showed a linear relationship between its runtime
and the number of processed time series, demonstrating a linear scalable behavior, which is mandatory in Big Data
environments. SCMFTS has been implemented in the Scala programming language for the Apache Spark framework, and
the code is publicly available.
Funding: Spanish Government (TIN2016-81113-R, BES-2017-080137); Andalusian Regional Government, Spain (P12-TIC-2985, P18-TP-5168); European Commission; European Commission Joint Research Centre
Complexity Measures and Features for Times Series classification
Classification of time series is a growing problem in different disciplines due
to the progressive digitalization of the world. Currently, the state-of-the-art
in time series classification is dominated by The Hierarchical Vote Collective
of Transformation-based Ensembles. This algorithm is composed of several
classifiers of different domains distributed in five large modules. The combination
of the results obtained by each module weighed based on an internal evaluation
process allows this algorithm to obtain the best results in state-of-the-art. One
Nearest Neighbour with Dynamic Time Warping remains the base classifier
in any time series classification problem for its simplicity and good results.
Despite their performance, they share a weakness, which is that they are not
interpretable. In the field of time series classification, there is a tradeoff between
accuracy and interpretability. In this work, we propose a set of characteristics
capable of extracting information on the structure of the time series to face time
series classification problems. The use of these characteristics allows the use of
traditional classification algorithms in time series problems. The experimental
results of our proposal show no statistically significant differences from the second
and third best models of the state-of-the-art. Apart from competitive results in
accuracy, our proposal is able to offer interpretable results based on the set of
characteristics proposed.
Funding: Spanish Government (TIN2016-81113-R, PID2020-118224RB-I00, BES-2017-080137); Andalusian Regional Government, Spain (P12-TIC-2958, P18-TP-5168, A-TIC-388-UGR-1)
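The abstract above names One Nearest Neighbour with Dynamic Time Warping as the usual baseline classifier. As a rough illustration of that baseline only (not of the feature set proposed in the paper), the following Python sketch implements textbook DTW with an absolute-difference cost and a 1-NN prediction on top of it. The function names, the toy training series and the labels "bump" and "dip" are all assumptions made for the example.

```python
def dtw_distance(a, b):
    """Textbook dynamic time warping distance with absolute-difference cost.

    Runs in O(len(a) * len(b)) time and keeps only two rows of the cost table.
    """
    inf = float("inf")
    prev = [0.0] + [inf] * len(b)       # row 0 of the cumulative cost table
    for x in a:
        curr = [inf] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            # Extend the cheapest of the three admissible warping steps.
            curr[j] = abs(x - y) + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[len(b)]


def predict_1nn(train, query):
    """Return the label of the training series closest to `query` under DTW.

    `train` is a list of (series, label) pairs.
    """
    _, label = min(train, key=lambda pair: dtw_distance(pair[0], query))
    return label


# Hypothetical toy data: one series per class.
train = [
    ([0, 0, 1, 2, 1, 0], "bump"),
    ([2, 2, 1, 0, 1, 2], "dip"),
]
query = [0, 1, 2, 1, 0, 0]              # the same bump, shifted one step earlier

print(predict_1nn(train, query))        # prints "bump"
```

Because DTW may stretch or compress the time axis, the shifted bump matches its class prototype at zero cost. The prediction itself, however, is just a nearest-neighbour label with no human-readable explanation, which is the interpretability gap the abstract points to.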
Ability to Predict Side-Out Performance by the Setter’s Action Range with First Tempo Availability in Top European Male and Female Teams
The aims of this study were to compare the Setter’s action range with availability of first
tempo (SARA) between male and female volleyball; and to determine the relationship between
several spatial and offensive variables and their influence on the success of the side-out in male and
female volleyball. A total of 1302 side-outs (639 male, 663 female) were registered (2019 European
Championship). The ranking, reception efficacy, position and trajectory of the setter between reception
and set, first tempo availability, side-out result, rotation, and attack lane were analyzed through
Recursive Partitioning for classification, regression and survival tree models and classification and
regression trees algorithms. Our results present female teams with more reduced SARAs than male
teams, meaning female setters tend to play closer to the net. The correlation between the ranking
and the distance from the average position of the setter to the ideal setting zone was not significant.
A movement of the setter of 30 or less and more than 1 m in distance might improve the performance
of the side-out. Depending on the spatial usage of the setter, some rotations might be more successful
than others. When assessing performance, the teams should consider the ability to play quick attacks
when their reception is not as precise as they would expect.
Funding: German Research Foundation (DFG) FPU14/02234; Spanish Ministry of Economy and Competitiveness - Spanish Ministry of Economy, Industry and Competitivity DEP2011-27503, TIN2016-81113-R; FEDER-Junta de Andalucia, Consejeria de Economia y Conocimiento TIC.388.UGR1
Memetic Algorithms with Local Search Chains in R: The Rmalschains Package
Global optimization is an important field of research both in mathematics and computer sciences. It has applications in nearly all fields of modern science and engineering. Memetic algorithms are powerful problem solvers in the domain of continuous optimization, as they offer a trade-off between exploration of the search space using an evolutionary algorithm scheme, and focused exploitation of promising regions with a local search algorithm. In particular, we describe the memetic algorithms with local search chains (MA-LS-Chains) paradigm, and the R package Rmalschains, which implements them. MA-LS-Chains has proven to be effective compared to other algorithms, especially in high-dimensional problem solving. In an experimental study, we demonstrate the advantages of using Rmalschains for high-dimension optimization problems in comparison to other optimization methods already available in R.
This work was supported in part by the Spanish Ministry of Science and Innovation (MICINN) under Project TIN-2009-14575. The work was performed while C. Bergmeir held a scholarship from the Spanish Ministry of Education (MEC) of the "Programa de Formación del Profesorado Universitario (FPU)".
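The exploration and exploitation trade-off described in this abstract can be sketched with a very small memetic algorithm in Python. This is neither the MA-LS-Chains algorithm nor the Rmalschains API: the steady-state replacement, the averaging crossover, the hill-climbing local search and every name below are simplifying assumptions. The one idea borrowed loosely from the "chains" concept is that each individual keeps its own local-search step size, so a later local search on the same individual resumes where the previous one stopped.

```python
import random


def sphere(x):
    """Toy objective to minimize; its global minimum is 0 at the origin."""
    return sum(v * v for v in x)


def clamp(x, lo, hi):
    return [min(hi, max(lo, v)) for v in x]


def local_search(f, x, fx, step, iters, lo, hi, rng):
    """Stochastic hill climbing that shrinks its step after each failed move."""
    for _ in range(iters):
        cand = clamp([v + rng.gauss(0, step) for v in x], lo, hi)
        fc = f(cand)
        if fc < fx:
            x, fx = cand, fc
        else:
            step *= 0.9
    return x, fx, step


def memetic_minimize(f, dim, lo, hi, pop_size=20, generations=200, seed=0):
    rng = random.Random(seed)
    span = hi - lo
    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(pop_size)]
    fit = [f(ind) for ind in pop]
    steps = [0.1 * span] * pop_size     # per-individual local-search state
    initial_best = min(fit)
    for _ in range(generations):
        # Exploration: one offspring per generation replaces the worst
        # individual if it is better (steady-state, elitist).
        i, j = rng.sample(range(pop_size), 2)
        child = clamp(
            [(a + b) / 2 + rng.gauss(0, 0.05 * span) for a, b in zip(pop[i], pop[j])],
            lo, hi,
        )
        fc = f(child)
        worst = max(range(pop_size), key=fit.__getitem__)
        if fc < fit[worst]:
            pop[worst], fit[worst], steps[worst] = child, fc, 0.1 * span
        # Exploitation: refine the current best, resuming from its stored step.
        best = min(range(pop_size), key=fit.__getitem__)
        pop[best], fit[best], steps[best] = local_search(
            f, pop[best], fit[best], steps[best], 5, lo, hi, rng
        )
    best = min(range(pop_size), key=fit.__getitem__)
    return pop[best], fit[best], initial_best


best_x, best_f, initial_f = memetic_minimize(sphere, dim=5, lo=-5.0, hi=5.0)
print(f"best fitness improved from {initial_f:.4f} to {best_f:.6f}")
```

Both phases only ever accept improvements, so the best fitness in the population never gets worse from one generation to the next.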
- …